Privacy by design & by default
The GDPR requires business processes to be designed with data protection front of mind. At Apteco, privacy is considered from the first step in the design of any functionality.
Here are several techniques that you can apply either within Apteco software or before or after it. See GDPR Articles 25 and 32.
Pseudonymisation
You can separate personal data from analysis variables before loading data into the Apteco software, using a pseudonym as an identifier. You can re-link the pseudonymised data with the personal data before sending it to the broadcaster or lettershop. As the pseudonym, you could use an existing variable such as customer number or a new one (e.g. a hash value).
Optionally, ‘organisational separation’ via a so-called third-party ‘trust centre’ can make this process even more robust, although it adds an extra level of complexity and a potential time lag. While this isn’t a process handled directly in the core Apteco software, it is a technique that you can apply before loading data: you can separate personal data in the Data Management or SCV step, or immediately after exporting data from it. Alternatively, if you do not use a trust centre, you could achieve organisational separation by allocating the two roles to different functional teams within your company.
Whilst the trust centre and post-processing re-personalisation is a useful technique, it lends itself more to offline, time-insensitive communications and larger enterprise environments with multiple organisational units than to time-sensitive digital communications or smaller companies.
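The separation and re-linking described in this section can be sketched in a few lines of Python. This is only an illustration that runs outside Apteco software; the field names, the `SECRET_KEY` value and the helper functions `make_pseudonym`, `split_record` and `relink` are all assumptions made for the example, and a keyed hash (HMAC-SHA256) is just one possible way to derive a pseudonym.

```python
import hashlib
import hmac

# Hypothetical secret: it should be held only by the party that does the re-linking.
SECRET_KEY = b"replace-with-a-real-secret"
# Assumed split between personal fields and analysis variables.
PERSONAL_FIELDS = {"customer_number", "name", "email"}


def make_pseudonym(customer_number: str) -> str:
    """Derive a stable pseudonym from an identifier using a keyed hash."""
    return hmac.new(SECRET_KEY, customer_number.encode("utf-8"), hashlib.sha256).hexdigest()


def split_record(record: dict) -> tuple[dict, dict]:
    """Split one source record into a personal part and an analysis part."""
    pseudonym = make_pseudonym(record["customer_number"])
    personal = {"pseudonym": pseudonym}
    analysis = {"pseudonym": pseudonym}
    for name, value in record.items():
        target = personal if name in PERSONAL_FIELDS else analysis
        target[name] = value
    return personal, analysis


def relink(selected: list[dict], personal_lookup: dict[str, dict]) -> list[dict]:
    """Re-attach personal data to selected analysis rows before hand-off."""
    return [{**personal_lookup[row["pseudonym"]], **row} for row in selected]


source = [
    {"customer_number": "C001", "name": "Ada Example", "email": "ada@example.com",
     "age_band": "30-39", "last_destination": "Spain"},
    {"customer_number": "C002", "name": "Bob Example", "email": "bob@example.com",
     "age_band": "50-59", "last_destination": "Italy"},
]

personal_lookup = {}
analysis_rows = []
for record in source:
    personal, analysis = split_record(record)
    personal_lookup[personal["pseudonym"]] = personal  # stays outside the analysis system
    analysis_rows.append(analysis)                     # this is what would be loaded

# Later: a selection made on analysis data is re-linked for the lettershop.
selection = [row for row in analysis_rows if row["last_destination"] == "Spain"]
mailing_file = relink(selection, personal_lookup)
```

Only `analysis_rows` would be loaded for analysis; the lookup table stays with whoever is responsible for re-linking, which could be a trust centre or a separate team.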
Differential privacy
Differential privacy aims to provide a means to maximise the accuracy of queries from statistical databases while minimising the chances of identifying the natural person behind the records. The following two techniques are available in Apteco software:
- Minimum visible count
- Noise scale
Minimum visible count
The minimum visible count (MVC) property makes it harder to identify an individual by a process of elimination. For example, in a FastStats system based on a consumer holiday database, you might be able to "find" one of your neighbours by specifying your village, the person's age group and their last destination, adding criteria until you isolate a selection of 1. You could then read off other information, e.g. next destination or personal preferences. MVC makes this harder by hiding any selection count or cube cell count that is below the minimum (say 10): any count below the MVC appears as zero.
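As a rough illustration of the rule just described, the following Python sketch masks small counts. It is not Apteco's implementation; the function name `visible_count` and the example cube are invented for the illustration.

```python
def visible_count(true_count: int, minimum_visible_count: int) -> int:
    """Return the count a user would see: zero when it is below the minimum."""
    return true_count if true_count >= minimum_visible_count else 0


# Hypothetical cube cells: (village, age band) -> number of people
cube = {
    ("Ashby", "30-39"): 42,
    ("Ashby", "80+"): 1,     # a single, easily identified person
    ("Burton", "30-39"): 10,
}

MVC = 10
masked = {cell: visible_count(n, MVC) for cell, n in cube.items()}
```

With a minimum of 10, the cell holding a single person is shown as zero, while cells at or above the minimum are shown unchanged.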
Noise scale
The noise scale property sets the amplitude of the random noise added to aggregated function results in cube cells (sum, mean, maximum, etc.). The idea is to prevent the user gaining information on any individual by introducing a small amount of uncertainty into the results. With a small amount of noise, the overall trends should still be evident. The noise emulates a Laplace distribution, which gives values that are more likely to be close to the correct value. (A uniform random distribution, by contrast, gives values that are equally likely across the range of the noise.) An additional time bias is included in the noise to make it harder to estimate exact figures by averaging multiple analysis runs.
Important: Only use this feature if absolute accuracy is not essential.
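To give a feel for how Laplace-distributed noise behaves, here is a minimal Python sketch. It is an assumption-laden illustration rather than the algorithm used in FastStats: the functions `laplace_noise` and `noisy_result` are hypothetical, the noise is drawn as the difference of two exponential samples (a standard way to obtain a Laplace sample), and the time bias mentioned above is not modelled.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one zero-centred Laplace sample with the given scale (scale > 0).

    The difference of two independent exponential samples with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_result(true_value: float, scale: float, rng: random.Random) -> float:
    """Return an aggregate value with Laplace noise added."""
    return true_value + laplace_noise(scale, rng)


rng = random.Random(42)   # seeded only so the example is repeatable
true_mean_spend = 250.0   # hypothetical cube cell value
scale = 5.0               # hypothetical noise scale

samples = [noisy_result(true_mean_spend, scale, rng) for _ in range(10_000)]
share_close = sum(abs(s - true_mean_spend) <= scale for s in samples) / len(samples)
average = sum(samples) / len(samples)
```

In this sketch, most draws land within one scale unit of the true value, which is the "more likely to be close" behaviour described above. Averaging many draws of pure zero-centred noise also converges on the true value, which is the weakness the time bias is meant to counter.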
Configuration
Follow these steps to configure the minimum visible count and noise scale properties:
- Open the FastStats Configurator.
  Tip: Search for Fast in the Start Menu.
- Select FastStats Services.
- Select the FastStats Service for the system, then click Properties.
- Select FastStats from the General tree and click Connection Settings.
- Select the Show advanced settings check box.
- Scroll down to find Minimum Visible Count and Noise Scale.
Note: Enabling noise scale prevents the “Why was I Selected?” function from working.
Data minimisation
Data processing should only use as much data as is required to accomplish a given task successfully; the GDPR states that you should adhere to the principle of data minimisation.
Below are a few suggestions:
Data Model
- Pseudonymisation (see above). Prevent personal data entering the analysis system upstream of Apteco software. This is probably the most robust technique.
- Minimise users' exposure to granular personal data by using the combine categories banding wizard in FastStats (e.g. banding Date of Birth into Age bands, or removing YY from DD.MM.YY for a birthday mailing trigger).
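The two banding examples above can be sketched outside FastStats as follows. This is only an illustration of the idea, not the banding wizard itself; the function names `age_band` and `birthday_key` and the 10-year band width are assumptions.

```python
from datetime import date


def age_band(date_of_birth: date, today: date, width: int = 10) -> str:
    """Convert a date of birth to a coarse age band such as '30-39'."""
    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    age = today.year - date_of_birth.year - (0 if had_birthday else 1)
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def birthday_key(date_of_birth: date) -> str:
    """Keep only day and month (DD.MM), enough to trigger a birthday mailing."""
    return date_of_birth.strftime("%d.%m")


dob = date(1985, 7, 14)
band = age_band(dob, today=date(2024, 7, 13))
trigger = birthday_key(dob)
```

Neither derived value reveals the full date of birth, yet both remain usable for selections: the band for profiling and the day and month for the mailing trigger.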
Data Retention Policy
Each company must define its policy on how long to store data before deleting it. You can manage this in the upstream databases (SCV or CRM).
Data Access
You can also control access to personal data by restricting who has access to what.
- Directory access rights: you can control a group’s or a user’s Read, Write, and Delete permissions at a high level.
- You can use Row and Column filters to control whether a variable is Selectable, Exportable, or Browsable (SEB visibility). To set this, right-click under the Visibility column for a variable.
Data Rights
Data rights are a key strength of FastStats. You can apply them at the Variable level using Designer, which limits what is returned to the user.
Data Properties
From within the FastStats System Explorer you can also limit how a variable is used.
- Right-click on a variable > Properties > Security.
Export Restrictions
You can restrict the output file types that a user or group of users can export.
You can delete the output option for a file type.
Note: Data is exported to the Server, not to the User’s PC.
Velocity checking
Velocity checking is a way of restricting the volume of data that users can export from a FastStats system and the time periods in which they can export it. If a user exceeds their limit, the export is quarantined; the user can then ask an administrator for an authorisation code to produce that export.
You can also set a size limit for the Sample Download file in the PeopleStage Delivery Step. This setting is in the FastStats Configurator > FastStats Service > (select the system) > Properties, under the PeopleStage node.
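The general idea of a volume limit within a time window can be sketched as below. This is a conceptual Python illustration, not how FastStats implements velocity checking: the `VelocityChecker` class, its parameters and the limit values are hypothetical, and the administrator's authorisation code is reduced to a simple `authorised` flag.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class VelocityChecker:
    max_records: int      # records a user may export within one window
    window: timedelta     # length of the rolling time window
    history: list = field(default_factory=list)  # (timestamp, record_count) of released exports

    def request_export(self, when: datetime, record_count: int, authorised: bool = False) -> str:
        """Release the export, or quarantine it if it would exceed the limit."""
        cutoff = when - self.window
        recent = sum(n for t, n in self.history if t > cutoff)
        if recent + record_count > self.max_records and not authorised:
            return "quarantined"
        self.history.append((when, record_count))
        return "released"


checker = VelocityChecker(max_records=1000, window=timedelta(days=1))
start = datetime(2024, 1, 1, 9, 0)

first = checker.request_export(start, 600)
second = checker.request_export(start + timedelta(hours=1), 500)
approved = checker.request_export(start + timedelta(hours=1), 500, authorised=True)
next_week = checker.request_export(start + timedelta(days=7), 900)
```

Here the second request would take the user past the limit within the window, so it is held until it is explicitly authorised; a request made after the window has passed is released again.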
Audit trails
You can use the FastStats Variable Scanner to show which variables have been used and when. For example, it can identify variables that have never or rarely been used in any analysis, which you should then delete from the data model (-> data minimisation).
To install and use the FastStats Variable Scanner, download the Utilities zip and install the Variable Scanner on the same machine as your FastStats system. The scanner analyses one system and creates an output that you use with Designer to build another system. You need Excelsior to generate the reports from the Variable Scanner, so you will need to set up another FastStats system and have the appropriate licence files.
You then perform the analysis in that new system and can view results in Excelsior.
Permission preferences
Permission preferences are managed outside of Apteco software; however, you can integrate opt-in preferences into the data model so that they are brought in from the upstream database.
Ensure permission preferences are applied in a timely fashion, as there is often a time lag between Selections and Campaign Execution (or within a multi-step campaign). You can apply them in the following ways:
- Upfront at time of selection, using permission flags in the selection.
- Update of permission flags between data updates in FastStats, e.g. every 2 hours through the day.
- Via external variables in FastStats (only works for existing URNs).
- Reapplication of permission at campaign run time in PeopleStage using (Universal) Area constraint filters, e.g. “Valid E-Mail Opt-in”.
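The reason for reapplying permissions at run time can be shown with a small Python sketch. It is purely conceptual and does not use any PeopleStage API; the URNs, the flag dictionaries and the `reapply_permissions` helper are invented for the example.

```python
def reapply_permissions(selected_urns: list[str], current_opt_in: dict[str, bool]) -> list[str]:
    """Keep only URNs whose opt-in flag is still true at campaign run time.

    URNs with no known flag are treated as not opted in.
    """
    return [urn for urn in selected_urns if current_opt_in.get(urn, False)]


# Permission flags as they stood when the selection was made.
opt_in_at_selection = {"U1": True, "U2": True, "U3": False}
selected = [urn for urn, opted_in in opt_in_at_selection.items() if opted_in]

# Between selection and execution, U2 withdraws consent.
opt_in_at_run_time = {"U1": True, "U2": False, "U3": False}
to_contact = reapply_permissions(selected, opt_in_at_run_time)
```

Without the second check, U2 would still be contacted on the basis of a flag that was valid only at selection time.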